The Illumina Infinium HumanMethylation450 BeadChip, the
successor to the hugely popular HumanMethylation27 BeadChip,
is arguably the most prevalent platform for large-scale
DNA methylome studies. After the success of 
last year's meeting 1 that discussed initial analysis strategies 
for this then-new platform, this year's meeting (held at Queen 
Mary, University of London) included the presentation of 
now established pipelines and normalization methods for 
data analysis, as well as some exciting tools for downstream 
analysis. The importance of defining cell composition was 
a new topic mentioned by most speakers. The epigenome 
varies between cell types, so it is essential to ensure that 
methylation differences reflect sample treatment rather than a 
differing cell population. The meeting was attended by 215 
computational and bench scientists from 18 countries. There 
were 11 speakers, a small poster session, and a discussion 
session. Talks were recorded and are now freely available at 
http://www.illumina.com/applications/epigenetics/array-based_methylation_analysis/methylation-array-analysis-education.ilmn 



Approaches to 450k Analysis 

Christoph Bock (CeMM Research Center for Molecular 
Medicine of the Austrian Academy of Sciences, Austria) was one 
of two invited speakers and gave an overview of DNA methyl- 
ation data in general terms as well as the number of different 
assays available. Dr Bock highlighted the 450k array's accuracy 
and its similarity to genotyping applications, as well as its quick 
and easy analysis workflow compared with sequencing-based 
methods. To this end, he briefly presented a 450k analysis tool 
developed in his group, RnBeads 
(http://rnbeads.bioinf.mpi-inf.mpg.de). Other tools mentioned, to 
which Dr Bock is a contributor, included EpiExplorer, EpiGRAPH, 
BiQ Analyzer, and MethMarker. Finally, Dr Bock presented GoDMC 
(Genetics of DNA Methylation Consortium), which facilitates 
genome-wide association studies of DNA methylation. This 
consortium has been organized to bring together researchers 
studying the genetic basis of DNA methylation and to provide a 
centralized hub for coordinating analysis, summary statistics, 
replication, and meta-analysis. 

In addition to RnBeads, a number of other software packages 
have been developed for 450k data analysis. Chloe Wong (King's 
College London, UK) presented their recently published package, 
wateRmelon, 2 available from Bioconductor. wateRmelon provides 
a number of QC steps, including the use of SNP probes to assess 
sample relatedness and identify possible sample mix-ups. 
The package provides a wide variety of normalization options; 
the authors compared 15 preprocessing methods and 
found their own Dasen method to be the most effective. 
However, they encourage users to experiment and determine the 
method best suited to their data. 
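
The idea of using SNP probes to check sample identity can be 
sketched as follows. This is a minimal illustration, not the 
package's implementation: the genotype thresholds (0.2 and 0.8), 
the concordance threshold (0.9), the function names, and the 
sample values are all assumptions made for the example. 

```python
from itertools import combinations

def call_genotype(beta, low=0.2, high=0.8):
    """Map a SNP-probe beta value to genotype class 0, 1, or 2.

    The thresholds are illustrative assumptions, not published cutoffs.
    """
    if beta < low:
        return 0
    if beta > high:
        return 2
    return 1

def concordance(sample_a, sample_b):
    """Fraction of SNP probes with the same genotype call in both samples."""
    calls_a = [call_genotype(b) for b in sample_a]
    calls_b = [call_genotype(b) for b in sample_b]
    matches = sum(x == y for x, y in zip(calls_a, calls_b))
    return matches / len(calls_a)

def flag_related_pairs(samples, threshold=0.9):
    """Return name pairs whose genotype concordance meets the threshold."""
    flagged = []
    for (name_a, betas_a), (name_b, betas_b) in combinations(sorted(samples.items()), 2):
        if concordance(betas_a, betas_b) >= threshold:
            flagged.append((name_a, name_b))
    return flagged

# Hypothetical beta values at six SNP probes for three samples.
samples = {
    "patient1_blood":  [0.05, 0.51, 0.93, 0.48, 0.07, 0.95],
    "patient1_saliva": [0.08, 0.47, 0.90, 0.52, 0.04, 0.91],
    "patient2_blood":  [0.92, 0.06, 0.49, 0.94, 0.50, 0.05],
}
flagged = flag_related_pairs(samples)
print(flagged)
```

Pairs that are flagged but not expected to come from the same 
individual (or expected pairs that are not flagged) would be 
candidates for a sample mix-up. 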

Tiffany Morris (Cancer Institute, University College London, 
UK) presented the recently released ChAMP package, an 
R package available for download at 
http://www2.cancer.ucl.ac.uk/medicalgenomics/champ/. This package offers novel 
methods for downstream analysis. These include a function 
for batch effect estimation using singular-value decomposition 
(SVD) that highlights technical variation; a new way to call dif- 
ferentially methylated regions (DMRs) using a feature-oriented 
dynamic window that aims to capture neighboring significant 
probes; and also a function for estimating copy number aberra- 
tions (CNAs) in 450k data. 
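
To illustrate how a decomposition of this kind can expose technical 
variation, the sketch below finds the dominant component of 
variation across samples and compares its scores between batches. 
It is not ChAMP's code: it uses power iteration on the 
sample-by-sample Gram matrix as a standard-library stand-in for a 
full SVD, and the function names, batch labels, and beta values 
are hypothetical. 

```python
import math

def first_sample_component(matrix, iterations=200):
    """Leading right singular vector of a row-centered probes x samples matrix.

    Computed by power iteration on the Gram matrix X^T X, which shares
    its top eigenvector with the first right singular vector of X.
    """
    n_samples = len(matrix[0])
    centered = []
    for row in matrix:
        mean = sum(row) / n_samples
        centered.append([value - mean for value in row])
    gram = [[sum(r[i] * r[j] for r in centered) for j in range(n_samples)]
            for i in range(n_samples)]
    vec = [float(i + 1) for i in range(n_samples)]  # deterministic start vector
    for _ in range(iterations):
        nxt = [sum(gram[i][j] * vec[j] for j in range(n_samples))
               for i in range(n_samples)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    return vec

def mean_score_by_batch(scores, batches):
    """Average component score per batch label."""
    groups = {}
    for score, batch in zip(scores, batches):
        groups.setdefault(batch, []).append(score)
    return {batch: sum(vals) / len(vals) for batch, vals in groups.items()}

# Hypothetical beta values: rows are probes, columns are samples.
betas = [
    [0.20, 0.22, 0.40, 0.41],
    [0.50, 0.52, 0.70, 0.72],
    [0.80, 0.79, 0.60, 0.61],
    [0.30, 0.31, 0.50, 0.49],
]
batches = ["chip1", "chip1", "chip2", "chip2"]  # hypothetical batch labels

scores = first_sample_component(betas)
print(mean_score_by_batch(scores, batches))
```

If the leading component separates samples by batch, as it does in 
this constructed example, technical variation is likely to dominate 
the data and should be addressed before testing for biological 
differences. 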

Kasper Hansen (Johns Hopkins Bloomberg School of Public 
Health, USA) presented the minfi package available from 
Bioconductor. Dr Hansen gave an overview of the currently 
available functions for QC and normalization, and mentioned 
functions to be released in the near future: one that predicts 
the sex of samples to highlight mislabeled samples, one that 
estimates cell composition, and a DMR caller. Dr Hansen 
particularly emphasized the importance of considering sample 
mix-ups, especially in large EWAS, as up to 10% of samples can be 
mislabeled. 
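
One way such a sex check could work is to compare signal 
intensities from probes on the X and Y chromosomes. The sketch 
below is an assumption-laden illustration rather than the minfi 
implementation: the cutoff of -2 on the log2 scale, the function 
names, and the intensity values are chosen for the example only. 

```python
import math
from statistics import median

def predict_sex(x_intensities, y_intensities, cutoff=-2.0):
    """Predict sex from total intensities of X- and Y-chromosome probes.

    Samples without a Y chromosome show only background signal on
    Y probes, so the Y-minus-X difference of median log2 intensities
    is strongly negative. The cutoff is an illustrative assumption.
    """
    x_med = median(math.log2(v) for v in x_intensities)
    y_med = median(math.log2(v) for v in y_intensities)
    return "F" if (y_med - x_med) < cutoff else "M"

def find_mislabeled(samples):
    """Names of samples whose predicted sex disagrees with the record."""
    return [name for name, s in sorted(samples.items())
            if predict_sex(s["x"], s["y"]) != s["recorded_sex"]]

# Hypothetical intensities for three samples.
samples = {
    "sample_01": {"recorded_sex": "F", "x": [4000, 4200, 3900], "y": [300, 250, 280]},
    "sample_02": {"recorded_sex": "M", "x": [3800, 4100, 4000], "y": [3500, 3600, 3300]},
    "sample_03": {"recorded_sex": "F", "x": [4100, 3900, 4000], "y": [3400, 3700, 3600]},
}
mislabeled = find_mislabeled(samples)
print(mislabeled)
```

A disagreement between predicted and recorded sex does not say 
which record is wrong, only that the sample deserves a closer look. 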

Data Integration and Methylation Studies 

Kim Siegmund (University of Southern California, USA) was 
the second invited speaker and discussed her implementation 
of the methylumi package available from Bioconductor. Dr 
Siegmund focused on the importance of background correction 
(of both Type I and Type II probes) and dye bias adjustment 
(of Type II probes) and emphasized the impact of background 
noise on β values, particularly for studies where small changes 
are expected and proper signal processing is therefore essential. 
Dr Siegmund has found that removing background 
noise and adjusting for dye bias remove the commonly 
observed trend of increasing signals across an array. As such, Dr 
Siegmund is confident that there is now a strong understanding 
of normalization methods; the next priority is 
differentiating cell types. She ended by giving an overview 
of publicly available data sets from The Cancer Genome Atlas 
(TCGA). 
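
The effect of background noise on β values can be seen with a 
small worked example. The sketch assumes the conventional 
definition β = M / (M + U + offset) with an offset of 100, a 
constant additive background in both channels, and a naive 
subtraction in place of the model-based corrections used in 
practice; all intensity values are hypothetical. 

```python
def beta_value(meth, unmeth, offset=100.0):
    """Beta value from methylated (M) and unmethylated (U) intensities."""
    return meth / (meth + unmeth + offset)

def subtract_background(signal, background):
    """Naive background correction: subtract and floor at zero."""
    return max(signal - background, 0.0)

background = 1000.0  # hypothetical additive noise per channel

# Hypothetical true (M, U) signals for one probe in two groups.
true_signals = {"control": (1000.0, 9000.0), "treated": (1500.0, 8500.0)}

raw = {}
corrected = {}
for group, (meth, unmeth) in true_signals.items():
    obs_m, obs_u = meth + background, unmeth + background
    raw[group] = beta_value(obs_m, obs_u)
    corrected[group] = beta_value(subtract_background(obs_m, background),
                                  subtract_background(obs_u, background))

raw_difference = raw["treated"] - raw["control"]
corrected_difference = corrected["treated"] - corrected["control"]
print(round(raw_difference, 4), round(corrected_difference, 4))
```

Because the background inflates the denominator, the uncorrected 
difference between groups is smaller than the corrected one, which 
matters most when the expected change is itself small. 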

Francesco Marabita (Karolinska Institute, Sweden) presented 
his recently published evaluation 3 of six analysis 
pipelines that differ in normalization method. A number of 
pipelines now exist and it is important to evaluate them. He used two 
unpublished and two published data sets to make the comparison 
and found quantile normalization and Beta Mixture Quantile 
dilation (BMIQ) to be the most effective methods. 
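
For readers unfamiliar with quantile normalization, a minimal 
sketch follows. It shows the generic between-sample procedure 
only, not any of the evaluated pipelines and not BMIQ; the 
function name, the tie handling (ties broken by position), and 
the input values are assumptions made for illustration. 

```python
def quantile_normalize(columns):
    """Quantile-normalize samples given as equal-length lists of values.

    Each sample's values are replaced by the mean of the values that
    share the same rank across all samples, so every sample ends up
    with an identical distribution while keeping its own rank order.
    """
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [sum(col[k] for col in sorted_cols) / len(columns)
                 for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices by rank
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Two hypothetical samples measured at three probes.
normalized = quantile_normalize([[0.10, 0.50, 0.90],
                                 [0.30, 0.70, 0.20]])
print(normalized)
```

After normalization the two samples contain the same set of 
values, and only the assignment of values to probes differs. 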

Robert Lowe (Queen Mary University of London, UK) pre- 
sented his recent work on a new public database for Illumina 
450k data known as Marmal-Aid (http://marmal-aid.org/). This 
allows easy access to all publicly available 450k data (currently 
8000+ samples). The data are extracted from public repositories 
and a number of preprocessing steps are applied, such as data 
imputation and normalization. He showed that samples cluster 
by tissue rather than by batch, which suggests that methylation 
differences of >10% are detectable above the influence of batch. 

Roderick Slieker (Leiden University Medical Center, The 
Netherlands) presented a method for characterizing 
tissue-specific differentially methylated regions (tDMRs) in two 
independent data sets of four peripheral tissues (blood, saliva, buccal 
swab, and hair follicles) and six internal tissues (liver, muscle, 
pancreas, subcutaneous fat, omentum, and spleen, with paired 
blood). Of the tDMRs, 13% mapped to gene body CpG islands 
and 25% to CpG island shores. Applying annotations 
that recently became available through ENCODE showed 
enrichment of tDMRs in DNase hypersensitive sites and 
transcription factor binding sites. 

Amy Webster (Manchester University, UK) presented results 
from her study on differential methylation related to 
response to etanercept in patients with rheumatoid arthritis (RA). A 
huge advance in the treatment of RA has 
been the introduction of biologic drug therapies; however, 40% 
of patients fail to respond. They chose 24 patients to identify 
a methylation signature indicative of response and found nine 
probes that showed significant differences. While the sample size 
was small, these preliminary data indicate a possible methylation 
biomarker of response. 

The program was brought to a satisfying close following 
two talks given by representatives from Illumina: Bret Barnes 
(Illumina, USA) explained clearly the rationale behind the 
design of the 450k array and the requirement for a mixture of 
probe types, while Fraz Syed (Illumina, USA) discussed a 
whole-genome bisulfite sequencing library preparation kit. 

Discussion and Future Considerations 

It is clear that analysis pipelines for the 450k have evolved over the 
past year. While last year's talks focused heavily on Type I and 
Type II probe differences, these appear less of a concern now. Details 
of the preprocessing methods are very important, particularly in 
EWAS, and a number of effective methods are now available. 
One issue that was mentioned in a number of talks was the 
importance of knowing the cell type composition in an experiment. 
Dr Hansen highlighted this particularly effectively during 
his talk, illustrating the point with aging data: newborns 
have a significantly different blood composition from that of 
the elderly, so correcting for this is essential before looking 
for changes in methylation between the two groups. 
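
The reason cell composition matters can be shown with simple 
arithmetic: a bulk β value is approximately a proportion-weighted 
average of the β values of the constituent cell types. In the 
sketch below, every number, cell type label, and group name is 
hypothetical; the point is only that two groups can differ in bulk 
methylation even when no cell type changes its methylation. 

```python
def bulk_beta(proportions, cell_type_betas):
    """Bulk beta value as a proportion-weighted mean over cell types."""
    return sum(proportions[cell] * cell_type_betas[cell]
               for cell in cell_type_betas)

# Hypothetical methylation of one CpG, identical in both groups.
cell_type_betas = {"cell_type_A": 0.80, "cell_type_B": 0.20}

# Hypothetical cell proportions that differ between the groups.
group_young = {"cell_type_A": 0.60, "cell_type_B": 0.40}
group_old = {"cell_type_A": 0.30, "cell_type_B": 0.70}

young_beta = bulk_beta(group_young, cell_type_betas)
old_beta = bulk_beta(group_old, cell_type_betas)
apparent_difference = young_beta - old_beta
print(round(young_beta, 2), round(old_beta, 2), round(apparent_difference, 2))
```

Here the apparent difference arises entirely from the shift in 
proportions, which is why composition must be estimated and 
accounted for before interpreting group differences. 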

Readers wishing to keep informed of issues related to the 
Illumina 450k platform are encouraged to visit the actively used 
web forum set up last year, http://groups.google.com/group/epigenomicsforum 